Abstract

We develop a new formulation of Stein's method to obtain computable upper bounds on the total variation distance between the geometric distribution and a distribution of interest. Our framework reduces the problem to the construction of a coupling between the original distribution and the "discrete equilibrium" distribution from renewal theory. We illustrate the approach in four nontrivial examples: the geometric sum of independent, non-negative, integer-valued random variables having common mean, the generation size of the critical Galton-Watson process conditioned on non-extinction, the in-degree of a randomly chosen node in the uniform attachment random graph model, and the total degree of both a fixed and randomly chosen node in the preferential attachment random graph model. In the first two examples we obtain error bounds in a metric that is stronger than those available in the literature, and in the final two examples we provide the first explicit bounds.



1 INTRODUCTION 



The exponential and geometric distributions are convenient and accurate approximations in a wide variety of complex settings involving rare events, extremes, and waiting times. The difficulty in obtaining explicit error bounds for these approximations beyond an elementary setting is discussed in the preface of Aldous (1989), where the author also points out a lack of such results. Recently, Peköz and Röllin (2011) developed a framework to obtain error bounds for the Kolmogorov and Wasserstein distance metrics between the exponential distribution and a distribution of interest. The main ingredients there are Stein's method (see Ross and Peköz (2007) for an introduction) along with the equilibrium distribution from renewal theory. Due to the flexibility of Stein's method and the close connection between the exponential and geometric distributions, it is natural to attempt to use similar techniques to obtain bounds for the (stronger) total variation distance metric between the geometric distribution and an integer supported distribution. The purpose of this paper is to obtain such bounds having application in both situations where exponential approximation is and is not available.

Version from August 25, 2011, 06:50

There are, however, some major complications that arise in trying to carry over approaches for the exponential to the geometric and the stronger total variation metric. To see this, we will first discuss the relationship between our results for approximation by the geometric distribution and the body of literature devoted to approximation by the exponential distribution (see Peköz and Röllin (2011) and references therein). For our purposes, the most pertinent previous efforts focus on determining $d_K(\mathscr{L}(Z), \mathscr{L}(W))$, where $Z$ is an exponential random variable with rate one and $W$ is a mean one random variable; for random variables $U$ and $V$, we define the Kolmogorov distance by

\[ d_K(\mathscr{L}(U), \mathscr{L}(V)) := \sup_{x} |\mathbb{P}(U \le x) - \mathbb{P}(V \le x)|. \]

If $W$ is a non-negative random variable, $Z$ has the exponential distribution with rate one and $X$ has the geometric distribution with parameter $p$, the triangle inequality and $d_K(\mathscr{L}(Z), \mathscr{L}(pX)) \le p$ from Peköz and Röllin (2011, Theorem 3.1) give

\[ |d_K(\mathscr{L}(Z), \mathscr{L}(W)) - d_K(\mathscr{L}(pX), \mathscr{L}(W))| \le p. \tag{1.1} \]

Alternatively, in the case that $W$ has the form $Y/\mathbb{E}Y$, where $Y$ is a positive integer-valued random variable, we can compare the distribution of $Y$ to the geometric distribution in the total variation distance, which is a standard measure between the distributions of integer-valued random variables $U$ and $V$ defined by

\[ d_{TV}(\mathscr{L}(U), \mathscr{L}(V)) := \sup_{B \subseteq \mathbb{Z}} |\mathbb{P}(U \in B) - \mathbb{P}(V \in B)|. \]

From this point, it is appropriate to discuss the implications of (1.1) in determining the total variation distance between the distribution of an integer-valued random variable and the geometric distribution. First, note that

\[ d_K(\mathscr{L}(U), \mathscr{L}(V)) \le d_{TV}(\mathscr{L}(U), \mathscr{L}(V)), \tag{1.2} \]

since the supremum on the right hand side is taken over a larger set. Thus, an upper bound on the total variation distance between the distributions of two random variables immediately implies the same bound on the Kolmogorov distance. Secondly, there is no useful inequality which is converse to (1.2); for example, if $U_n$ is a random variable which is uniformly distributed on the set of even integers between 1 and $2n$ and $V_n$ is uniformly distributed on the set of odd integers in the same range, we see $d_K(\mathscr{L}(U_n), \mathscr{L}(V_n)) = 1/n$ but $d_{TV}(\mathscr{L}(U_n), \mathscr{L}(V_n)) = 1$. Moreover, there is no canonical method to finesse results from the Kolmogorov distance to the total variation distance. That is, even in the case where approximation by the exponential distribution is fruitful and known results can be applied, it is not clear how to obtain our results below from those existing.
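The even/odd example can be checked by direct computation. The following is a minimal sketch, not part of the original argument; the function names are illustrative, and the total variation distance is computed as half the $\ell_1$ distance between the probability mass functions, which agrees with the supremum over sets.

```python
from fractions import Fraction

def uniform_pmf(points):
    """Uniform distribution on a finite set of integers, as {value: probability}."""
    points = list(points)
    w = Fraction(1, len(points))
    return {x: w for x in points}

def kolmogorov_distance(p, q):
    """sup_x |P(U <= x) - P(V <= x)| for two pmfs on the integers."""
    cdf_p = cdf_q = best = Fraction(0)
    for x in sorted(set(p) | set(q)):
        cdf_p += p.get(x, 0)
        cdf_q += q.get(x, 0)
        best = max(best, abs(cdf_p - cdf_q))
    return best

def total_variation_distance(p, q):
    """sup_B |P(U in B) - P(V in B)|, computed as half the l1 distance."""
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in set(p) | set(q)) / 2

def even_odd_example(n):
    """Return (d_K, d_TV) for U_n uniform on evens and V_n uniform on odds in [1, 2n]."""
    u = uniform_pmf(range(2, 2 * n + 1, 2))
    v = uniform_pmf(range(1, 2 * n, 2))
    return kolmogorov_distance(u, v), total_variation_distance(u, v)
```

The two supports are disjoint, so the total variation distance stays at one for every n, while the distribution functions interlace and never differ by more than one atom.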

Besides the total variation metric, we will also give bounds on the local metric

\[ d_{loc}(\mathscr{L}(U), \mathscr{L}(V)) := \sup_{m \in \mathbb{Z}} |\mathbb{P}(U = m) - \mathbb{P}(V = m)|. \]

It is clear that $d_{loc}$ will be less than or equal to $\sup_m [\mathbb{P}(U = m) \vee \mathbb{P}(V = m)]$, so that typically better rates need to be obtained in order to provide useful information in this metric.

Our formulation rests on the idea that a positive integer-valued random variable $W$ will be approximately geometrically distributed with parameter $p = 1/\mathbb{E}W$ if $\mathscr{L}(W) \approx \mathscr{L}(W^e)$, where $W^e$ has the (discrete) equilibrium distribution with respect to $W$ defined by

\[ \mathbb{P}(W^e \le k) = \frac{1}{\mathbb{E}W} \sum_{i=1}^{k} \mathbb{P}(W \ge i), \qquad k = 1, 2, \ldots \tag{1.3} \]

This distribution arises in discrete-time renewal theory as the time until the next renewal when the process is stationary, and the transformation which maps a distribution to its equilibrium distribution has the geometric distribution with positive support as its unique fixed point. Our main result is an upper bound on the total variation distance between the distribution of $W$ and a geometric distribution with parameter $(\mathbb{E}W)^{-1}$, in terms of a coupling between the random variables $W^e$ and $W$.
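As an illustration of the equilibrium transformation, the sketch below computes the equilibrium probability mass function in the pointwise form $\mathbb{P}(W^e = k) = \mathbb{P}(W \ge k)/\mathbb{E}W$ (obtained by differencing the cumulative form) and can be used to check numerically that a geometric distribution is, up to truncation error, left unchanged. The helper names and the truncation level are assumptions made for this example only.

```python
from fractions import Fraction

def geometric_pmf(p, kmax):
    """P(Z = k) = (1 - p)**(k - 1) * p for k = 1..kmax (truncated, not renormalised)."""
    return {k: (1 - p) ** (k - 1) * p for k in range(1, kmax + 1)}

def equilibrium_pmf(pmf):
    """Equilibrium pmf P(W^e = k) = P(W >= k) / E[W] for a pmf on the positive integers."""
    mean = sum(k * w for k, w in pmf.items())
    tail = 0
    out = {}
    for k in range(max(pmf), 0, -1):   # accumulate P(W >= k) from the top down
        tail += pmf.get(k, 0)
        out[k] = tail / mean
    return out
```

For example, `equilibrium_pmf(geometric_pmf(0.3, 200))` agrees with `geometric_pmf(0.3, 200)` to within floating-point error on small k, whereas a uniform distribution on {1, 2, 3} is mapped to the masses 1/2, 1/3, 1/6.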

This setup is closely related to the exponential approximation formulation of Peköz and Röllin (2011) and also Goldstein (2009), which is also related to the zero-bias transformation of Goldstein and Reinert (1997). As discussed above, a serious difficulty in pushing the results of Peköz and Röllin (2011) through to the stronger total variation metric is that the support of the distribution to be approximated may not match the support of the geometric distribution well enough. This issue is typical in bounding the total variation distance between integer-valued random variables and can be handled by introducing a term into the bound that quantifies 'smoothness,' see, for example, Barbour and Čekanavičius (2002); Röllin (2005, 2008). Even with this difficulty, in many of the situations where the ideas of Peköz and Röllin (2011) can be applied, the results here will yield comparable statements in a stronger metric. To illustrate this point, we apply our abstract formulation to obtain new error bounds in two of the examples treated in Peköz and Röllin (2011).

The geometric distribution may also arise as a limiting distribution where 
exponential approximation is not available. Thus, we will also apply our 
theory in two examples (discussed in more detail immediately below) that 
fall into this category. We remark here that these two examples are more 
naturally suited for geometric approximation so that the technical difficulties 
discussed above relating to the period of the distribution do not arise. 

The first application is a bound on the total variation distance between the geometric distribution and the sum of a geometrically distributed number of independent, non-negative, integer-valued random variables with common mean. The distribution of such geometric convolutions has been considered in many places in the literature in the setting of exponential approximation and convergence; a book-length treatment is given in Kalashnikov (1997). The second application is a variation on the classical theorem of Yaglom (1947) describing the asymptotic behavior of the generation size of a critical Galton-Watson process conditioned on non-extinction. This theorem has a large literature of extensions and embellishments (see Lalley and Zheng (in press) for example). Peköz and Röllin (2011) obtained a rate of convergence for the Kolmogorov distance between the generation size of a critical Galton-Watson process conditioned on non-extinction and the exponential distribution. Here we obtain an analogous bound for the geometric distribution in total variation distance. The third application is to the in-degree of a randomly chosen node in the uniform attachment random graph discussed in Bollobás et al. (2001), and the final application is to the total degree of both a fixed and a randomly chosen node in the preferential attachment random graph discussed in Bollobás et al. (2001). In contrast to our first two examples, these examples do not derive from an exponential approximation result.

Finally, we mention that there are other formulations of geometric approximation using Stein's method. For example, Peköz (1996) and Barbour and Grübel (1995) use the intuition that a positive, integer-valued random variable $W$ approximately has a geometric distribution with parameter $p = \mathbb{P}(W = 1)$ if

\[ \mathscr{L}(W) \approx \mathscr{L}(W - 1 \mid W > 1). \]

Other approaches can be found in Phillips and Weinberg (2000) and Daly (2010).

The organization of this article is as follows. In Section 2 we present our main theorems, and Sections 3, 4, 5 and 6 respectively contain applications to geometric sums, the critical Galton-Watson process conditioned on non-extinction, the uniform attachment random graph model, and the preferential attachment random graph model.

2 MAIN RESULTS 

A typical issue when discussing the geometric distribution is whether to have the support begin at zero or one. Denote by $\mathrm{Ge}(p)$ the geometric distribution with positive support; that is, $\mathscr{L}(Z) = \mathrm{Ge}(p)$ if $\mathbb{P}(Z = k) = (1-p)^{k-1}p$ for positive integers $k$. Alternatively, denote by $\mathrm{Ge}^0(p)$ the geometric distribution $\mathrm{Ge}(p)$ shifted by minus one, that is "starting at 0." Since $\mathscr{L}(Z) = \mathrm{Ge}(p)$ implies $\mathscr{L}(Z - 1) = \mathrm{Ge}^0(p)$, it is typical that results for one of $\mathrm{Ge}(p)$ or $\mathrm{Ge}^0(p)$ easily pass to the other. Unfortunately, our methods do not appear to trivially transfer between these two distributions, so we are forced to develop our theory for both cases in parallel.

First, we give an alternate definition of the equilibrium distribution that we will use in the proof of our main result.

Definition 2.1. Let $X$ be a positive, integer-valued random variable with finite mean. We say that an integer-valued random variable $X^e$ has the discrete equilibrium distribution w.r.t. $X$ if for all bounded $f$ and $\nabla f(x) = f(x) - f(x-1)$ we have

\[ \mathbb{E}f(X) - f(0) = \mathbb{E}X \, \mathbb{E}\nabla f(X^e). \tag{2.1} \]

Remark 2.2. To see how (2.1) is equivalent to (1.3), note that we have

\[ \mathbb{E}f(X) - f(0) = \mathbb{E}\sum_{i=1}^{X} \nabla f(i) = \sum_{i=1}^{\infty} \nabla f(i)\, \mathbb{P}(X \ge i) = \mathbb{E}X \, \mathbb{E}\nabla f(X^e). \]

In order to handle non-negative random variables, we also introduce a variation of Definition 2.1.

Definition 2.3. If $X$ is a non-negative integer-valued random variable with $\mathbb{P}(X = 0) > 0$, we say that an integer-valued random variable $X^{e_0}$ has the discrete equilibrium distribution w.r.t. $X$ if for all bounded $f$ and with $\Delta f(x) = f(x+1) - f(x)$ we have

\[ \mathbb{E}f(X) - f(0) = \mathbb{E}X \, \mathbb{E}\Delta f(X^{e_0}). \]

Note that we are defining the term "discrete equilibrium distribution" in both of the previous definitions, but this should not cause confusion as the support of the base distribution dictates the meaning of the terminology.

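As a final bit of notation before the statement of our main results, for a function $g$ with domain $\mathbb{Z}$, let $\|g\| = \sup_{k \in \mathbb{Z}} |g(k)|$, and for any integer-valued random variable $W$ and any $\sigma$-algebra $\mathcal{F}$, define the conditional smoothness

\[ S_1(W \mid \mathcal{F}) = \sup_{\|g\| \le 1} |\mathbb{E}\{\Delta g(W) \mid \mathcal{F}\}| = 2\, d_{TV}(\mathscr{L}(W + 1 \mid \mathcal{F}), \mathscr{L}(W \mid \mathcal{F})), \tag{2.2} \]

and the second order conditional smoothness

\[ S_2(W \mid \mathcal{F}) = \sup_{\|g\| \le 1} |\mathbb{E}\{\Delta^2 g(W) \mid \mathcal{F}\}|, \]

where $\Delta^2 g(k) = \Delta g(k+1) - \Delta g(k)$. In order to simplify the presentation of the main theorems, we let $d_1 = d_{TV}$ and $d_2 = d_{loc}$.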
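For a concrete feel for the smoothness term, the sketch below evaluates the unconditional version of $S_1$ (trivial $\sigma$-algebra) through the identity $S_1(W) = 2\,d_{TV}(\mathscr{L}(W+1), \mathscr{L}(W))$, which reduces to a sum of absolute differences of neighbouring point masses. The function name is illustrative and not from the text.

```python
from fractions import Fraction

def smoothness_s1(pmf):
    """S_1(W) = 2 d_TV(L(W + 1), L(W)) = sum_k |P(W = k - 1) - P(W = k)|."""
    support = set(pmf) | {k + 1 for k in pmf}
    return sum(abs(pmf.get(k - 1, 0) - pmf.get(k, 0)) for k in support)
```

A distribution concentrated on the even integers attains the largest possible value 2, reflecting a support with period two, while a uniform distribution on {0, 1, 2, 3} gives 1/2.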

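Theorem 2.1. Let $W$ be a positive integer-valued random variable with $\mathbb{E}W = 1/p$ for some $0 < p \le 1$ and let $W^e$ have the discrete equilibrium distribution w.r.t. $W$. Then with $D = W - W^e$, any $\sigma$-algebra $\mathcal{F} \supseteq \sigma(D)$ and event $A \in \mathcal{F}$ we have

\[ d_l(\mathscr{L}(W), \mathrm{Ge}(p)) \le \mathbb{E}\{|D|\, S_l(W \mid \mathcal{F})\, I_A\} + 2\,\mathbb{P}(A^c), \tag{2.3} \]

for $l = 1, 2$, and

\[ d_{TV}(\mathscr{L}(W^e), \mathrm{Ge}(p)) \le p\, \mathbb{E}|D|, \tag{2.4} \]

\[ d_{loc}(\mathscr{L}(W^e), \mathrm{Ge}(p)) \le p\, \mathbb{E}\{|D|\, S_1(W \mid \mathcal{F})\}; \tag{2.5} \]

on the RHS of (2.3) and (2.5), $S_l(W \mid \mathcal{F})$ can be replaced by $S_l(W^e \mid \mathcal{F})$.

Theorem 2.2. Let $W$ be a non-negative integer-valued random variable with $\mathbb{P}(W = 0) > 0$, $\mathbb{E}W = (1-p)/p$ for some $0 < p \le 1$ and let $W^{e_0}$ have the discrete equilibrium distribution w.r.t. $W$. Then with $D = W - W^{e_0}$, any $\sigma$-algebra $\mathcal{F} \supseteq \sigma(D)$ and event $A \in \mathcal{F}$ we have

\[ d_l(\mathscr{L}(W), \mathrm{Ge}^0(p)) \le (1-p)\, \mathbb{E}\{|D|\, S_l(W \mid \mathcal{F})\, I_A\} + 2(1-p)\, \mathbb{P}(A^c) \tag{2.6} \]

for $l = 1, 2$, and

\[ d_{TV}(\mathscr{L}(W^{e_0}), \mathrm{Ge}^0(p)) \le p\, \mathbb{E}|D|, \tag{2.7} \]

\[ d_{loc}(\mathscr{L}(W^{e_0}), \mathrm{Ge}^0(p)) \le p\, \mathbb{E}\{|D|\, S_1(W \mid \mathcal{F})\}; \tag{2.8} \]

on the RHS of (2.6) and (2.8), $S_l(W \mid \mathcal{F})$ can be replaced by $S_l(W^{e_0} \mid \mathcal{F})$.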

Before we prove Theorems 2.1 and 2.2 we make a few remarks related to these results.

Remark 2.4. It is easy to see that a random variable $W$ with law equal to $\mathrm{Ge}(p)$ has the property that $\mathscr{L}(W) = \mathscr{L}(W^e)$, so that $W^e$ can be taken to be $W$ and the theorem yields the correct error term in this case. The analogous statement is true for $\mathrm{Ge}^0(p)$ and $W^{e_0}$.

Remark 2.5. By choosing $A = \{W = W^{e_0}\}$ in (2.6) with $l = 1$, we find

\[ d_{TV}(\mathscr{L}(W), \mathrm{Ge}^0(p)) \le 2(1-p)\, \mathbb{P}(W \ne W^{e_0}), \tag{2.9} \]

and an analogous corollary holds for Theorem 2.1.

In order to use the theorem we need to be able to construct random variables with the discrete equilibrium distribution. The next proposition provides such a construction for a non-negative integer-valued random variable $W$. We say $W^s$ has the size-bias distribution of $W$, if

\[ \mathbb{E}\{W f(W)\} = \mathbb{E}W \, \mathbb{E}f(W^s) \]

for all $f$ for which the expectation exists.

Proposition 2.3. Let $W$ be an integer-valued random variable and let $W^s$ have the size-bias distribution of $W$.

1. If $W > 0$ and we define the random variable $W^e$ such that conditional on $W^s$, $W^e$ has the uniform distribution on the integers $\{1, 2, \ldots, W^s\}$, then $W^e$ has the discrete equilibrium distribution w.r.t. $W$.

2. If $W \ge 0$ with $\mathbb{P}(W = 0) > 0$ and we define the random variable $W^{e_0}$ such that conditional on $W^s$, $W^{e_0}$ has the uniform distribution on the integers $\{0, 1, \ldots, W^s - 1\}$, then $W^{e_0}$ has the discrete equilibrium distribution w.r.t. $W$.

Proof. For any bounded $f$ we have

\[ \mathbb{E}f(W) - f(0) = \mathbb{E}\sum_{i=1}^{W} \nabla f(i) = \mathbb{E}W \, \mathbb{E}\Bigl\{ \frac{1}{W^s} \sum_{i=1}^{W^s} \nabla f(i) \Bigr\} = \mathbb{E}W \, \mathbb{E}\nabla f(W^e), \]

which implies Item 1. The second item is proved analogously. □
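Item 1 of the proposition can be checked exactly on a small example. The sketch below builds the law of a uniform draw below a size-biased variable and compares it with the tail formula $\mathbb{P}(W \ge k)/\mathbb{E}W$ for the equilibrium distribution; the helper names are hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def size_bias_pmf(pmf):
    """P(W^s = k) = k P(W = k) / E[W] for a pmf on the positive integers."""
    mean = sum(k * w for k, w in pmf.items())
    return {k: k * w / mean for k, w in pmf.items() if k > 0}

def uniform_below_size_bias(pmf):
    """Law of W^e when, given W^s, W^e is uniform on {1, ..., W^s}."""
    out = {}
    for s, w in size_bias_pmf(pmf).items():
        for k in range(1, s + 1):
            out[k] = out.get(k, 0) + w / s
    return out

def tail_formula(pmf):
    """P(W >= k) / E[W] for k = 1..max of the support."""
    mean = sum(k * w for k, w in pmf.items())
    return {k: sum(w for j, w in pmf.items() if j >= k) / mean
            for k in range(1, max(pmf) + 1)}
```

For the distribution with masses 1/2, 1/4, 1/4 on 1, 2, 5, both constructions return the masses 4/9, 2/9, 1/9, 1/9, 1/9 on 1 through 5.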

As mentioned in the introduction, there can be considerable technical difficulty in ensuring the support of the distribution to be approximated is smooth. In Theorems 2.1 and 2.2 this issue is accounted for in the term $S_l(W \mid \mathcal{F})$. Typically, our strategy to bound this term will be to write $W$ (or $W^e$) as a sum of terms which are independent given $\mathcal{F}$ and then apply the following lemma from Mattner and Roos (2007, Corollary 1.6).

Lemma 2.4. (Mattner and Roos (2007, Corollary 1.6)) If $X_1, \ldots, X_n$ are independent integer-valued random variables and

\[ u_i = 1 - d_{TV}(\mathscr{L}(X_i), \mathscr{L}(X_i + 1)), \]

then

\[ d_{TV}\Bigl(\mathscr{L}\Bigl(\sum_{i=1}^{n} X_i\Bigr), \mathscr{L}\Bigl(1 + \sum_{i=1}^{n} X_i\Bigr)\Bigr) \le \sqrt{\frac{2}{\pi}} \Bigl(\frac{1}{4} + \sum_{i=1}^{n} u_i\Bigr)^{-1/2}. \]

Remark 2.6. It may seem surprising that the bound (2.4) gains an extra factor of $p$, which is considered small if the geometric distribution is approximately exponential. To explain this phenomenon, Proposition 2.3 indicates that $W^e$ is already 'smooth' in the sense that there are no gaps in its support. More precisely, the analog of the term in (2.3) accounting for the period of the support of $W^e$ is automatically small. Heuristically, note that from (2.1) we have that

\[ |\mathbb{E}\nabla f(W^e)| \le \frac{2\|f\|}{\mathbb{E}W}, \]

which implies that $d_{TV}(\mathscr{L}(W^e), \mathscr{L}(W^e + 1))$ is of order $(\mathbb{E}W)^{-1}$, regardless of the distribution of $W$.

Before we present the proof of Theorems 2.1 and 2.2, we must first develop the Stein's method machinery we will need. As in Peköz (1996), for any subset $B$ of the integers and any $p = 1 - q$ we construct the function $f = f_{B,p}$ defined by $f(0) = 0$ and for $k \ge 1$,

\[ q f(k) - f(k-1) = I_{k \in B} - \mathrm{Ge}(p)\{B\}, \tag{2.10} \]

where $\mathrm{Ge}(p)\{B\} = \sum_{i \in B} (1-p)^{i-1} p$ is the chance that a positive geometric random variable with parameter $p$ takes a value in the set $B$. It can be easily verified that the solution of (2.10) is given by

\[ f(k) = \sum_{i \in B} q^{i-1} - \sum_{i \in B,\, i \ge k+1} q^{i-k-1}. \tag{2.11} \]

Equivalently, for $k \ge 0$,

\[ q f(k+1) - f(k) = I_{k \in B-1} - \mathrm{Ge}^0(p)\{B - 1\}, \]

where we define $\mathrm{Ge}^0(p)\{B\}$ analogous to $\mathrm{Ge}(p)\{B\}$.
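The claim that the Stein equation is solved in closed form can be verified mechanically. The sketch below takes the solution to be $f(k) = \sum_{i \in B} q^{i-1} - \sum_{i \in B,\, i \ge k+1} q^{i-k-1}$ for a finite set $B$ of positive integers (the finiteness and the function names are assumptions for this example) and evaluates the residual of $q f(k) - f(k-1) = I_{k \in B} - \mathrm{Ge}(p)\{B\}$ in exact arithmetic.

```python
from fractions import Fraction

def stein_solution(B, p, k):
    """Candidate solution f(k) for a finite set B of positive integers, q = 1 - p."""
    q = 1 - p
    full = sum(q ** (i - 1) for i in B)
    tail = sum(q ** (i - k - 1) for i in B if i >= k + 1)
    return full - tail

def stein_residual(B, p, k):
    """q f(k) - f(k-1) - (1[k in B] - Ge(p){B}); zero when the Stein equation holds at k."""
    q = 1 - p
    ge_mass = sum(q ** (i - 1) * p for i in B)
    lhs = q * stein_solution(B, p, k) - stein_solution(B, p, k - 1)
    return lhs - ((1 if k in B else 0) - ge_mass)

def backward_difference(B, p, k):
    """f(k) - f(k-1), the quantity whose absolute value is bounded by one."""
    return stein_solution(B, p, k) - stein_solution(B, p, k - 1)
```

With `B = {2, 5, 6}` and `p = Fraction(1, 3)` the residual vanishes for every k that was tried, and the backward differences stay within [-1, 1].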

Peköz (1996) and Daly (2008) study properties of these solutions, but we need the following additional lemma to obtain our main result.

Lemma 2.5. For $f$ as above, we have

\[ \sup_{k \ge 1} |\nabla f(k)| = \sup_{k \ge 0} |\Delta f(k)| \le 1. \tag{2.12} \]

If, in addition, $B = \{m\}$ for some $m \in \mathbb{Z}$, then

\[ \sup_{k \ge 0} |f(k)| \le 1. \tag{2.13} \]

Proof. To show (2.12), note that

\[ \nabla f(k) = \sum_{i \in B,\, i \ge k} q^{i-k} - \sum_{i \in B,\, i \ge k+1} q^{i-k-1} = I_{k \in B} + \sum_{i \in B,\, i \ge k+1} (q^{i-k} - q^{i-k-1}) = I_{k \in B} - p \sum_{i \in B,\, i \ge k+1} q^{i-k-1}; \]

thus $-1 \le \nabla f(k) \le 1$. If now $B = \{m\}$, (2.13) is immediate from (2.11). □

We are now ready to present the proof of our main results. 

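Proof of Theorem 2.1. Given any positive integer-valued random variable $W$ with $\mathbb{E}W = 1/p$ and $D = W - W^e$ we have, using (2.10), Definition 2.1, and Lemma 2.5 in the two inequalities,

\[
\begin{aligned}
\mathbb{P}(W \in B) - \mathrm{Ge}(p)\{B\}
&= \mathbb{E}\{q f(W) - f(W-1)\} \\
&= \mathbb{E}\{\nabla f(W) - p f(W)\} \\
&= \mathbb{E}\{\nabla f(W) - \nabla f(W^e)\} \\
&\le \mathbb{E}\{I_A (\nabla f(W) - \nabla f(W^e))\} + 2\,\mathbb{P}(A^c) \\
&= \mathbb{E}\Bigl\{ I_A I[D > 0] \sum_{i=0}^{D-1} \mathbb{E}\bigl(\nabla f(W^e + i + 1) - \nabla f(W^e + i) \bigm| \mathcal{F}\bigr) \Bigr\} \\
&\quad + \mathbb{E}\Bigl\{ I_A I[D < 0] \sum_{i=0}^{-D-1} \mathbb{E}\bigl(\nabla f(W^e - i - 1) - \nabla f(W^e - i) \bigm| \mathcal{F}\bigr) \Bigr\} + 2\,\mathbb{P}(A^c) \\
&\le \mathbb{E}\{|D|\, S_1(W^e \mid \mathcal{F})\, I_A\} + 2\,\mathbb{P}(A^c),
\end{aligned}
\]

which is (2.3) for $l = 1$; analogously, one can obtain (2.3) with $S_1(W \mid \mathcal{F})$ in place of $S_1(W^e \mid \mathcal{F})$ on the RHS. In the case of $B = \{m\}$ we can make use of (2.13) to obtain

\[ |\mathbb{P}(W = m) - \mathrm{Ge}(p)\{m\}| \le \mathbb{E}\{|D|\, S_2(W^e \mid \mathcal{F})\, I_A\} + 2\,\mathbb{P}(A^c), \]

instead, which proves (2.3) for $l = 2$. For (2.4), we have that

\[ \mathbb{P}(W^e \in B) - \mathrm{Ge}(p)\{B\} = \mathbb{E}\{q f(W^e) - f(W^e - 1)\} = \mathbb{E}\{\nabla f(W^e) - p f(W^e)\} = p\, \mathbb{E}\{f(W) - f(W^e)\} \le p\, \mathbb{E}|D|, \]

where the last inequality follows by writing $f(W) - f(W^e)$ as a telescoping sum of $|D|$ terms no greater than $\|\nabla f\|$, which can be bounded using (2.12); (2.5) is straightforward using (2.13) and (2.2). □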

Proof of Theorem 2.2. Let $W$ be a non-negative integer-valued random variable with $\mathbb{E}W = (1-p)/p$ and $W^{e_0}$ as in the theorem. If we define $I$ to be independent of all else and such that $\mathbb{P}(I = 1) = 1 - \mathbb{P}(I = 0) = p$, then a short calculation shows that the variable defined by

\[ [(W+1)^e \mid I = 1] = W + 1 \quad\text{and}\quad [(W+1)^e \mid I = 0] = W^{e_0} + 1 \tag{2.14} \]

has the positive discrete equilibrium transform with respect to $W + 1$. Equation (2.6) now follows after noting that

\[ d_l(\mathscr{L}(W), \mathrm{Ge}^0(p)) = d_l(\mathscr{L}(W + 1), \mathrm{Ge}(p)), \]

and then applying the following stronger variation of (2.3), which is easily read from the proof of Theorem 2.1:

\[ d_l(\mathscr{L}(W + 1), \mathrm{Ge}(p)) \le \mathbb{E}\{|W + 1 - (W+1)^e|\, S_l(W \mid \mathcal{F})\, I_A\} + 2\, \mathbb{P}((W+1)^e \ne W + 1)\, \mathbb{P}(A^c). \]

Equations (2.7) and (2.8) can be proved in a manner similar to their analogs in Theorem 2.1. □

Remark 2.7. As the proof of Theorem 2.2 shows, (2.14) can be used to finesse results using the coupling $((W+1), (W+1)^e)$ for $W$ non-negative to results using the coupling $(W, W^{e_0})$. However, it is not clear how to implement this approach in application; for example in Section 3 below.

3 APPLICATIONS TO GEOMETRIC SUMS 

In this section we apply the results above to a sum of a geometric number of independent but not necessarily identically distributed random variables. As in our theory above, we will have separate results for the case where the sum is strictly positive and the case where it can take on the value zero with positive probability. We reiterate that although there are a variety of exponential approximation results in the literature for this example, there do not appear to be bounds available for the analogous geometric approximation in the total variation metric.

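Theorem 3.1. Let $X_1, X_2, \ldots$ be a sequence of independent, square integrable, positive, and integer-valued random variables, such that for some $u > 0$ we have, for all $i \ge 1$, $\mathbb{E}X_i = \mu$ and $u \le 1 - d_{TV}(\mathscr{L}(X_i), \mathscr{L}(X_i + 1))$. Let $\mathscr{L}(N) = \mathrm{Ge}(a)$ for some $0 < a \le 1$ and $W = \sum_{i=1}^{N} X_i$. Then with $p = 1 - q = a/\mu$, we have

\[ d_l(\mathscr{L}(W), \mathrm{Ge}(p)) \le C_l \sup_{i \ge 1} \mathbb{E}|X_i - X_i^e| \le C_l \bigl(\mu_2/2 + \tfrac{1}{2} + \mu\bigr) \tag{3.1} \]

for $l = 1, 2$, where $\mu_2 := \sup_i \mathbb{E}X_i^2$ and

\[ C_1 = \min\Bigl\{1,\ a\Bigl[1 + \Bigl(\frac{-2}{u \log(1-a)}\Bigr)^{1/2}\Bigr]\Bigr\}, \qquad C_2 = \min\Bigl\{1,\ a\Bigl[1 - \frac{6 \log(a)}{\pi u}\Bigr]\Bigr\}. \]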
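To get a feel for how the constants in the theorem scale, the sketch below evaluates them numerically. It assumes the readings $C_1 = \min\{1, a[1 + (-2/(u\log(1-a)))^{1/2}]\}$ and $C_2 = \min\{1, a[1 - 6\log(a)/(\pi u)]\}$ and the second bound $C_l(\mu_2/2 + 1/2 + \mu)$; the function names and the convention $C_1 = 1$ at $a = 1$ are choices made for this example.

```python
import math

def c1(a, u):
    """C_1 = min{1, a[1 + (-2 / (u log(1 - a)))^(1/2)]}; a = 1 is taken to give 1."""
    if a >= 1:
        return 1.0
    return min(1.0, a * (1 + math.sqrt(-2 / (u * math.log(1 - a)))))

def c2(a, u):
    """C_2 = min{1, a[1 - 6 log(a) / (pi u)]}."""
    return min(1.0, a * (1 - 6 * math.log(a) / (math.pi * u)))

def geometric_sum_bound(a, u, mu, mu2, l=1):
    """C_l (mu2 / 2 + 1 / 2 + mu), the cruder of the two bounds, for l = 1 or 2."""
    c = c1(a, u) if l == 1 else c2(a, u)
    return c * (mu2 / 2 + 0.5 + mu)
```

For small a the first constant behaves like a multiple of the square root of a, while the second behaves like a log(1/a), so both bounds are only informative when a is small or the summands are close to geometric.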






Theorem 3.2. Let $X_1, X_2, \ldots$ be a sequence of independent, square integrable, non-negative and integer-valued random variables, such that for some $u > 0$ we have, for all $i \ge 1$, $\mathbb{E}X_i = \mu$ and $u \le 1 - d_{TV}(\mathscr{L}(X_i), \mathscr{L}(X_i + 1))$. Let $\mathscr{L}(M) = \mathrm{Ge}^0(a)$ for some $0 < a \le 1$ and $W = \sum_{i=1}^{M} X_i$. Then with $p = 1 - q = a/(a + \mu(1-a))$, we have

\[ d_l(\mathscr{L}(W), \mathrm{Ge}^0(p)) \le C_l \sup_{i \ge 1} \mathbb{E}X_i^{e_0} \le C_l \bigl(\mu_2/(2\mu) - \tfrac{1}{2}\bigr), \tag{3.2} \]

for $l = 1, 2$, where $\mu_2 := \sup_i \mathbb{E}X_i^2$ and the $C_l$ are as in Theorem 3.1.

Before proving Theorems 3.1 and 3.2 we make a few remarks.



Remark 3.1. The first inequality in (3.1) yields the correct bound of zero when $X_i$ is geometric, as in this case we would have $X_i = X_i^e$; see Remark 2.4 following Theorem 2.1. Similarly, in the case where the $X_i$ have a Bernoulli distribution with expectation $\mu$, we have that $X_i^{e_0} = 0$ so that the left hand side of (3.2) is zero. That is, if $M \sim \mathrm{Ge}^0(a)$ and conditional on $M$, $W$ has the binomial distribution with parameters $M$ and $\mu$ for some $0 \le \mu \le 1$, then $W \sim \mathrm{Ge}^0(a/(a + \mu(1-a)))$.
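The thinning identity in this remark is easy to confirm numerically. The sketch below sums the mixture $\mathbb{P}(W = k) = \sum_{m \ge k} a(1-a)^m \binom{m}{k} \mu^k (1-\mu)^{m-k}$ term by term, using a ratio recurrence instead of explicit binomial coefficients, and can be compared with the geometric distribution started at zero with parameter $a/(a + \mu(1-a))$. The function names and the number of retained terms are assumptions of this example.

```python
def ge0_pmf(p, kmax):
    """P(Z = k) = (1 - p)**k * p for k = 0..kmax (geometric starting at 0)."""
    return [(1 - p) ** k * p for k in range(kmax + 1)]

def thinned_ge0_pmf(a, mu, kmax, terms=500):
    """pmf of W on 0..kmax when M ~ Ge^0(a) and W | M ~ Binomial(M, mu)."""
    pmf = []
    for k in range(kmax + 1):
        term = a * ((1 - a) * mu) ** k      # the m = k summand, where C(k, k) = 1
        total = 0.0
        for m in range(k, k + terms):
            total += term
            # ratio of consecutive summands: (1-a)(1-mu) C(m+1, k) / C(m, k)
            term *= (1 - a) * (1 - mu) * (m + 1) / (m + 1 - k)
        pmf.append(total)
    return pmf
```

With a = 0.2 and mu = 0.3, the thinned law matches `ge0_pmf(0.2 / 0.44, kmax)` to within floating-point error.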

Remark 3.2. The local metric result in Theorem 3.1 above will not be useful in the regime of exponential distribution convergence where we only assume $a$ is considered small, since the probabilities being approximated are of smaller order (linear in $a$) than the error bound ($a \log(1/a)$). However, in the case that the $X_i$ are close to geometric, the local metric result may yield useful information.

Remark 3.3. In the case where X j are i.i.d. but not necessarily integer 
valued and < a ^ ^ . iBrownl (1990, Theorem 2.1) obtains the exponential 
approximation result 

dK{^{W),EMyp)) ^^^ (3.3) 



for the weaker Kolmogorov metric. To compare (3.3) with (3.1) for small $a$, we observe that the bound (3.3) is linear in $a$ whereas (3.1), within a constant factor, behaves like $a(-\log(1-a))^{-1/2} \sim \sqrt{a}$. Therefore the bound (3.3) is better, but (3.1) applies to non-i.i.d. random variables (albeit having identical means) and to the stronger total variation metric.

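Proof of Theorem 3.1. First, let us prove that $W^e := \sum_{i=1}^{N-1} X_i + X_N^e$ has the discrete equilibrium distribution w.r.t. $W$, where, for each $i \ge 1$, $X_i^e$ is a random variable having the equilibrium distribution w.r.t. $X_i$, independent of all else. Note first that we have for bounded $f$ and every $m$,

\[ \mu\, \mathbb{E}\nabla f\Bigl(\sum_{i=1}^{m-1} X_i + X_m^e\Bigr) = \mathbb{E}\Bigl[ f\Bigl(\sum_{i=1}^{m} X_i\Bigr) - f\Bigl(\sum_{i=1}^{m-1} X_i\Bigr) \Bigr]. \]

Note also that since $N$ is geometric, for any bounded function $g$ with $g(0) = 0$ we have

\[ \mathbb{E}\{g(N) - g(N-1)\} = a\, \mathbb{E}g(N). \]

We now assume that $f(0) = 0$. Hence, using the above two facts and independence between $N$ and the sequence $X_1, X_2, \ldots$, we have

\[ \mathbb{E}W \, \mathbb{E}\nabla f(W^e) = \frac{\mu}{a}\, \mathbb{E}\nabla f\Bigl(\sum_{i=1}^{N-1} X_i + X_N^e\Bigr) = \frac{1}{a}\, \mathbb{E}\Bigl[ f\Bigl(\sum_{i=1}^{N} X_i\Bigr) - f\Bigl(\sum_{i=1}^{N-1} X_i\Bigr) \Bigr] = \mathbb{E}f(W). \]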

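Now,

\[ D = W - W^e = X_N - X_N^e, \]

and setting $\mathcal{F} = \sigma(N, X_N^e, X_N)$, we have

\[ S_1(W^e \mid \mathcal{F}) = S_1\Bigl(\sum_{i=1}^{N-1} X_i \Bigm| \mathcal{F}\Bigr) \le 1 \wedge \Bigl(\frac{2}{\pi(0.25 + (N-1)u)}\Bigr)^{1/2} \le 1 \wedge \Bigl(\frac{2}{\pi(N-1)u}\Bigr)^{1/2}, \tag{3.4} \]

where we have used Lemma 2.4 and the fact that $S_1(W^e \mid \mathcal{F})$ is almost surely bounded by one. We now have

\[ \mathbb{E}[|D|\, S_1(W^e \mid \mathcal{F})] \le \mathbb{E}\Bigl\{ \Bigl[ 1 \wedge \Bigl(\frac{2}{\pi(N-1)u}\Bigr)^{1/2} \Bigr] \mathbb{E}^{N} |X_N - X_N^e| \Bigr\}. \tag{3.5} \]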



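From here, we can obtain the first inequality in (3.1) by applying Theorem 2.1 after noting that

\[ \mathbb{E}^{N} |X_N - X_N^e| \le \sup_{i \ge 1} \mathbb{E}|X_i - X_i^e|, \]

and

\[
\begin{aligned}
\mathbb{E}\Bigl[ 1 \wedge \Bigl(\frac{2}{\pi(N-1)u}\Bigr)^{1/2} \Bigr]
&\le 1 \wedge \Bigl\{ a + \Bigl(\frac{2}{\pi u}\Bigr)^{1/2} \sum_{i \ge 1} \frac{a(1-a)^{i}}{i^{1/2}} \Bigr\} \\
&\le 1 \wedge a\Bigl[ 1 + \Bigl(\frac{2}{\pi u}\Bigr)^{1/2} \Bigl(\frac{-\pi}{\log(1-a)}\Bigr)^{1/2} \Bigr] \\
&= 1 \wedge a\Bigl[ 1 + \Bigl(\frac{-2}{\log(1-a)\,u}\Bigr)^{1/2} \Bigr],
\end{aligned}
\]

where we have used

\[ \sum_{i \ge 1} \frac{a(1-a)^{i-1}}{i^{1/2}} \le \frac{a}{1-a} \int_0^\infty \frac{(1-a)^x}{x^{1/2}}\, dx = \frac{a}{1-a} \Bigl(\frac{-\pi}{\log(1-a)}\Bigr)^{1/2}. \]

The second inequality in (3.1) follows from Theorem 2.1 and the fact (from the definition of the transformation $X^e$) that $\mathbb{E}|X_N - X_N^e| \le \mu_2/2 + \tfrac{1}{2} + \mu$.

To obtain the local limit result, note that, if $V = X + Y$ is the sum of two independent random variables, then $S_2(V) \le S_1(X) S_1(Y)$. Hence,

\[ S_2(W^e \mid \mathcal{F}) \le 1 \wedge \frac{2}{\pi(0.25 + (N/2 - 1)_+ u)} \le 1 \wedge \frac{6}{\pi(N-1)u}. \]

From here we have

\[ \mathbb{E}\Bigl[ 1 \wedge \frac{6}{\pi(N-1)u} \Bigr] \le 1 \wedge \Bigl\{ a + \frac{6}{\pi u} \sum_{i > 1} \frac{a(1-a)^{i-1}}{i-1} \Bigr\} = 1 \wedge \Bigl( a - \frac{6a\log(a)}{\pi u} \Bigr). \]

□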



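Proof of Theorem 3.2. It is straightforward to check that $W^{e_0} := \sum_{i=1}^{M} X_i + X_{M+1}^{e_0}$ has the equilibrium distribution with respect to $W$. Now,

\[ D = W - W^{e_0} = -X_{M+1}^{e_0}, \]

and setting $\mathcal{F} = \sigma(M, X_{M+1}^{e_0})$, we have

\[ S_1(W^{e_0} \mid \mathcal{F}) = S_1\Bigl(\sum_{i=1}^{M} X_i \Bigm| \mathcal{F}\Bigr), \]

which can be bounded above by (3.4) as in the proof of Theorem 3.1. The remainder of the proof closely follows that of Theorem 3.1. For example, the expression analogous to (3.5) is

\[ \mathbb{E}[|D|\, S_1(W^{e_0} \mid \mathcal{F})] \le \mathbb{E}\Bigl\{ \min\Bigl\{ 1, \Bigl(\frac{2}{\pi M u}\Bigr)^{1/2} \Bigr\} X_{M+1}^{e_0} \Bigr\}, \]

and the definition of the transform $X^{e_0}$ implies that

\[ \mathbb{E}X_i^{e_0} = \frac{\mathbb{E}X_i^2}{2\mu} - \frac{1}{2}. \]

□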



4 APPLICATION TO THE CRITICAL GALTON-WATSON BRANCHING PROCESS

Let $Z_0 = 1, Z_1, Z_2, \ldots$ be a Galton-Watson branching process with offspring distribution $\mathscr{L}(Z_1)$. A theorem due to Yaglom (1947) states that, if $\mathbb{E}Z_1 = 1$ and $\mathrm{Var}\, Z_1 = \sigma^2 < \infty$, then $\mathscr{L}(n^{-1} Z_n \mid Z_n > 0)$ converges to an exponential distribution with mean $\sigma^2/2$. The recent article Peköz and Röllin (2011) is the first to give an explicit bound on the rate of convergence for this asymptotic result. Using ideas from there, we give a convergence rate for the total variation error of a geometric approximation to $Z_n$ under finite third moment of the offspring distribution and the natural periodicity requirement that

\[ d_{TV}(\mathscr{L}(Z_1), \mathscr{L}(Z_1 + 1)) < 1. \tag{4.1} \]

This type of smoothness condition is typical in the context of Stein's method for approximation by a discrete distribution; see e.g. Barbour and Čekanavičius (2002) and Röllin (2008).

For the proof of the following theorem, we make use of the construction of Lyons, Pemantle, and Peres (1995); we refer to that article for more details on the construction and only present what is needed for our purpose.

Theorem 4.1. For a critical Galton-Watson branching process with offspring distribution $\mathscr{L}(Z_1)$ such that $\mathbb{E}Z_1^3 < \infty$ and (4.1) holds, we have

\[ d_{TV}\Bigl(\mathscr{L}(Z_n \mid Z_n > 0), \mathrm{Ge}\Bigl(\frac{2}{\sigma^2 n}\Bigr)\Bigr) \le \frac{C \log n}{n^{1/4}} \tag{4.2} \]

for some constant $C$ which is independent of $n$.

Remark 4.1. Peköz and Röllin (2011, Theorem 4.1) gives the result

\[ d_K\bigl(\mathscr{L}(2Z_n/(\sigma^2 n) \mid Z_n > 0), \mathrm{Exp}(1)\bigr) \le C \Bigl(\frac{\log n}{n}\Bigr)^{1/2} \tag{4.3} \]

without Condition (4.1) for the weaker Kolmogorov metric. It can be seen that the bound in (4.2) is not as good as the bound in (4.3) for large $n$, but (4.2) applies to the stronger total variation metric.

Proof of Theorem 4.1. First we construct a size-biased branching tree as in Lyons et al. (1995). We assume that this tree is labeled and ordered, in the sense that, if $w$ and $v$ are vertices in the tree from the same generation and $w$ is to the left of $v$, then the offspring of $w$ is to the left of the offspring of $v$. Start in generation 0 with one vertex $v_0$ and let it have a number of offspring distributed according to the size-bias distribution of $\mathscr{L}(Z_1)$. Pick one of the offspring of $v_0$ uniformly at random and call it $v_1$. To each of the siblings of $v_1$ attach an independent Galton-Watson branching process with offspring distribution $\mathscr{L}(Z_1)$. For $v_1$ proceed as for $v_0$, i.e., give it a size-biased number of offspring, pick one uniformly at random, call it $v_2$, attach independent Galton-Watson branching processes to the siblings of $v_2$ and so on. It is clear that this will always give an infinite tree as the "spine" $v_0, v_1, v_2, \ldots$ is an infinite sequence where $v_i$ is an individual (or particle) in generation $i$.

We now need some notation. Denote by $S_n$ the total number of particles in generation $n$. Denote by $L_n$ and $R_n$ the number of particles to the left (excluding $v_n$) and to the right (including $v_n$) of vertex $v_n$, so that $S_n = L_n + R_n$. We can describe these particles in more detail, according to the generation at which they split off from the spine. Denote by $S_{n,j}$ the number of particles in generation $n$ that stem from any of the siblings of $v_j$ (but not $v_j$ itself). Clearly, $S_n = 1 + \sum_{j=1}^{n} S_{n,j}$, where the summands are independent. Likewise, let $L_{n,j}$ and $R_{n,j}$ be the number of particles in generation $n$ that stem from the siblings to the left and right of $v_j$ (note that $L_{n,n}$ and $R_{n,n}$ are just the number of siblings of $v_n$ to the left and to the right, respectively). We have the relations $L_n = \sum_{j=1}^{n} L_{n,j}$ and $R_n = 1 + \sum_{j=1}^{n} R_{n,j}$. Note that, for fixed $j$, $L_{n,j}$ and $R_{n,j}$ are in general not independent, as they are linked through the offspring size of $v_{j-1}$.

Let now $R'_{n,j}$ be independent random variables such that

\[ \mathscr{L}(R'_{n,j}) = \mathscr{L}(R_{n,j} \mid L_{n,j} = 0), \]

and, with $A_{n,j} = \{L_{n,j} = 0\}$, define

\[ R^*_{n,j} = R_{n,j} I_{A_{n,j}} + R'_{n,j} I_{A^c_{n,j}} = R_{n,j} + (R'_{n,j} - R_{n,j}) I_{A^c_{n,j}}. \]

Define also $R^*_n = 1 + \sum_{j=1}^{n} R^*_{n,j}$. Let us collect a few facts from Peköz and Röllin (2011) which we will then use to give the proof of the theorem (here and in the rest of the proof, $C$ shall denote a constant which is independent of $n$, but may depend on $\mathscr{L}(Z_1)$ and may also be different from formula to formula):

(i) $\mathscr{L}(R^*_n) = \mathscr{L}(Z_n \mid Z_n > 0)$;

(ii) $S_n$ has the size-biased distribution of $Z_n$, and $v_n$ is equally likely to be any of the $S_n$ particles;

(iii) $\mathbb{E}\{R'_{n,j} I_{A^c_{n,j}}\} \le \sigma^2\, \mathbb{P}[A^c_{n,j}]$;

(iv) $\mathbb{E}\{R_{n,j} I_{A^c_{n,j}}\} \le \gamma\, \mathbb{P}[A^c_{n,j}]$, and $\mathbb{E}\{R_{n-1,j} I_{A^c_{n,j}}\} \le \gamma\, \mathbb{P}[A^c_{n,j}]$, where $\gamma = \mathbb{E}Z_1^3$;

(v) $\mathbb{P}[A^c_{n,j}] \le \sigma^2\, \mathbb{P}[Z_{n-j} > 0] \le C/(n - j + 1)$ for some $C > 0$.

In light of (i) and (ii) (and then using the construction in Proposition 2.3 to see that $R_n$ has the discrete equilibrium distribution w.r.t. $\mathscr{L}(R^*_n)$) we can let $W = R^*_n$, $W^e = R_n$ and

\[ D = R^*_n - R_n = \sum_{j=1}^{n} (R'_{n,j} - R_{n,j}) I_{A^c_{n,j}}. \]

Also let

\[ N = \sum_{j=1}^{n-1} R_{n-1,j} I_{A^c_{n,j}} \quad\text{and}\quad M = \sum_{j=1}^{n} R_{n,j} I_{A^c_{n,j}}, \]

and note that (iii)-(v) give

\[ \mathbb{E}|D| \le C \log n \quad\text{and}\quad \mathbb{E}N \le C \log n. \tag{4.4} \]

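Next with $\mathcal{F} = \sigma(N, D, R_{n-1}, M, R_{n,n} I_{A_{n,n}})$ and, letting $Z_1^i$, $i = 1, 2, \ldots$, be i.i.d. copies of $Z_1$, we have

\[ \mathscr{L}\bigl(R_n - M - R_{n,n} I_{A_{n,n}} - 1 \bigm| \mathcal{F}\bigr) = \mathscr{L}\Bigl(\sum_{i=1}^{R_{n-1} - N} Z_1^i \Bigm| R_{n-1}, N\Bigr), \]

which follows since $R_n - M = 1 + \sum_{j=1}^{n} R_{n,j} I_{A_{n,j}}$ and the particles counted by $R_{n-1} - N$ will be parents of the particles counted by $R_n - M - 1 - R_{n,n} I_{A_{n,n}}$. Then we use Lemma 2.4 to obtain

\[ S_1(W^e \mid \mathcal{F}) = S_1\bigl(R_n - M - R_{n,n} I_{A_{n,n}} - 1 \bigm| \mathcal{F}\bigr) \le \frac{0.8}{(0.25 + (R_{n-1} - N)u)^{1/2}}. \tag{4.5} \]

As a direct corollary of (2.4), for any bounded function $f$ we have

\[ \mathbb{E}f(W^e) \le \mathbb{E}f(X_p) + p \|f\|\, \mathbb{E}|W^e - W|, \tag{4.6} \]

where $X_p \sim \mathrm{Ge}(p)$. Fix $q = 1/\mathbb{E}[Z_{n-1} \mid Z_{n-1} > 0]$, $k = q^{-1/4}$ and let

\[ A = \{N \le k,\ |D| \le k,\ R_{n-1} > 2k\}, \]

and

\[ f(x) = (x - k)^{-1/2} I_{x \ge 2k+1}. \]

Using (4.4), (4.5), (4.6), and the fact that $\|f\| \le k^{-1/2}$, we find

\[ \mathbb{E}[f(X_q)] \le \sum_{i=1}^{\infty} \frac{q(1-q)^{i}}{i^{1/2}} \le (q\pi)^{1/2} \]

to obtain

\[
\begin{aligned}
\mathbb{E}[|D|\, S_1(W^e \mid \mathcal{F})\, I_A] &\le k u^{-1/2}\, \mathbb{E}f(R_{n-1}) \\
&\le k u^{-1/2} \bigl( \mathbb{E}f(X_q) + q k^{-1/2}\, \mathbb{E}|D_{n-1}| \bigr) \\
&\le C q^{1/4} \log n,
\end{aligned}
\]

where $D_{n-1} = R^*_{n-1} - R_{n-1}$. Now, applying (2.4) yields

\[ \mathbb{P}(R_{n-1} \le 2k) \le 1 - (1-q)^{2k} + q\, \mathbb{E}|D_{n-1}| \le q(2k + \mathbb{E}|D_{n-1}|), \]

and by Markov's inequality and (4.4) we finally obtain

\[ \mathbb{P}(A^c) \le k^{-1}(\mathbb{E}N + \mathbb{E}|D|) + q(2k + \mathbb{E}|D_{n-1}|) \le C q^{1/4} \log n. \]

The theorem follows after using (v) and $\mathbb{E}Z_n = 1$ to obtain $\mathbb{E}[Z_n \mid Z_n > 0] \ge Cn$. □

5 APPLICATION TO THE UNIFORM ATTACHMENT RANDOM GRAPH MODEL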

Let $G_n$ be a directed random graph on $n$ nodes defined by the following
recursive construction. Initially the graph starts with one node with a single
loop where one end of the loop contributes to the "in-degree" and the other
to the "out-degree." Now, for $2 \le m \le n$, given the graph with $m - 1$ nodes,
add node $m$ along with an edge directed from $m$ to a node chosen uniformly
at random among the $m$ nodes present. Note that this model allows edges
connecting a node with itself. This random graph model is referred to as
uniform attachment.
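The recursive construction above can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_attachment_in_degrees` and the use of the standard `random` module are our own assumptions, not part of the paper.

```python
import random

def uniform_attachment_in_degrees(n, rng):
    """Return the in-degrees of nodes 1..n for one sample of the graph G_n."""
    in_deg = [0] * (n + 1)          # index 0 is unused
    in_deg[1] = 1                   # the initial loop contributes one in-degree to node 1
    for m in range(2, n + 1):
        # node m attaches to a node chosen uniformly among the m nodes present,
        # so a self-loop at m is allowed
        target = rng.randint(1, m)
        in_deg[target] += 1
    return in_deg[1:]

rng = random.Random(0)              # fixed seed, hypothetical choice for reproducibility
in_degrees = uniform_attachment_in_degrees(1000, rng)
```

Each of the $n$ steps adds exactly one edge, so the in-degrees of a sample always sum to $n$; this gives a quick sanity check on the simulation.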



This model has been well studied, and it was shown in Bollobás et al. (2001) that if $W$ is equal to the in-degree of a node chosen uniformly at random from $G_n$, then $W$ converges to a geometric distribution (starting at 0) with parameter 1/2 as $n \to \infty$. We will give an explicit bound on the total variation distance between the distribution of $W$ and the geometric distribution that yields this asymptotic. Some related results for this and other random graph models were given using Stein's method in Ford (2009), where the author obtained the same rate found in our result below (but with larger constant) using an ad hoc analysis of the model. As we will see, our framework grants easy access to the result of Ford (2009) for this model. It is worth noting that this type of result is not obtainable using exponential approximation.

Theorem 5.1. If $W$ is the in-degree of a node chosen uniformly at random
from the random graph $G_n$ generated according to uniform attachment, then

$$d_{TV}(\mathscr{L}(W), \mathrm{Ge}^0(1/2)) \le \frac{1}{n}.$$

Proof of Theorem 5.1. Let $X_i$ have a Bernoulli distribution, independent of
all else, with parameter $\mu_i := (n - i + 1)^{-1}$, and let $N$ be an independent
random variable that is uniform on the integers $1, 2, \ldots, n$. If we imagine
that node $n + 1 - N$ is the randomly selected node, then it is easy to see that
we can write $W := \sum_{i=1}^{N} X_i$.

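Next, let us prove that $\sum_{i=1}^{N-1} X_i$ has the discrete equilibrium distribution
w.r.t. $W$. First note that we have for bounded $f$ and every $m$,

$$\mu_m \mathbb{E}\Delta f\Big(\sum_{i=1}^{m-1} X_i\Big) = \mathbb{E}\Big[f\Big(\sum_{i=1}^{m} X_i\Big) - f\Big(\sum_{i=1}^{m-1} X_i\Big)\Big],$$

where we use

$$\mathbb{E} f(X_m) - f(0) = \mathbb{E}X_m \, \mathbb{E}\Delta f(0)$$

and thus the fact that we can write $(X_m)^e = 0$. Note also that for any
bounded function $g$ with $g(0) = 0$ we have

$$\mathbb{E}\Big(\frac{g(N)}{\mu_N} - \frac{g(N-1)}{\mu_N}\Big) = \mathbb{E} g(N).$$

We now assume that $f(0) = 0$. Hence, using the above two facts and
independence between $N$ and the sequence $X_1, X_2, \ldots$, we have

$$\mathbb{E}\Delta f\Big(\sum_{i=1}^{N-1} X_i\Big) = \mathbb{E}\Big[\frac{f\big(\sum_{i=1}^{N} X_i\big) - f\big(\sum_{i=1}^{N-1} X_i\big)}{\mu_N}\Big] = \mathbb{E} f\Big(\sum_{i=1}^{N} X_i\Big) = \mathbb{E} f(W).$$
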
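Now, let

$$N' = \begin{cases} N & \text{if } 1 \le N < n, \\ 0 & \text{if } N = n. \end{cases}$$

We have that $\mathscr{L}(N') = \mathscr{L}(N - 1)$ so that

$$W^e := \sum_{i=1}^{N'} X_i$$

has the equilibrium distribution with respect to $W$ and it is plain that

$$\mathbb{P}[W \ne W^e] \le \mathbb{P}[N = n] = \frac{1}{n}.$$

Applying (2.9) of Remark 2.5 yields the theorem. □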
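For small $n$ the representation $W = \sum_{i=1}^{N} X_i$ used in the proof can be evaluated exactly, which gives a numerical check of the $1/n$ rate. The sketch below is our own illustration; the helper names `in_degree_pmf` and `tv_to_geometric0` are hypothetical, and $\mathrm{Ge}^0(p)$ is taken to have mass $p(1-p)^k$ at $k = 0, 1, \ldots$.

```python
def in_degree_pmf(n):
    """Exact pmf of W = X_1 + ... + X_N with N uniform on 1..n and
    independent X_i ~ Bernoulli(1 / (n - i + 1))."""
    pmf = [0.0] * (n + 1)
    partial = [1.0]                          # law of the empty sum
    for i in range(1, n + 1):
        mu = 1.0 / (n - i + 1)
        nxt = [0.0] * (len(partial) + 1)
        for k, pr in enumerate(partial):     # convolve with Bernoulli(mu)
            nxt[k] += pr * (1.0 - mu)
            nxt[k + 1] += pr * mu
        partial = nxt
        for k, pr in enumerate(partial):     # the event N = i has probability 1/n
            pmf[k] += pr / n
    return pmf

def tv_to_geometric0(pmf, p=0.5):
    """Total variation distance between pmf (on 0..len(pmf)-1) and Ge^0(p)."""
    total = sum(abs(pr - p * (1.0 - p) ** k) for k, pr in enumerate(pmf))
    total += (1.0 - p) ** len(pmf)           # geometric mass beyond the support of W
    return 0.5 * total

for n in (2, 10, 100):
    print(n, tv_to_geometric0(in_degree_pmf(n)), 1.0 / n)
```

The printed distances can be compared directly with $1/n$; the computation is exact up to floating-point error, so no Monte Carlo noise is involved.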

6 APPLICATION TO THE PREFERENTIAL ATTACHMENT 
RANDOM GRAPH MODEL 

Define the directed graph $G_n$ on $n$ nodes by the following recursive construction. Initially the graph starts with one node with a single loop where one end of the loop contributes to the "in-degree" and the other to the "out-degree." Now, for $2 \le m \le n$, given the graph with $m - 1$ nodes, add node $m$ along with an edge directed from $m$ to a random node chosen proportional to the total degree of the node. Note that at step $m$, the chance that node $m$ connects to itself is $1/(2m - 1)$ since we consider the added vertex $m$ as immediately having out-degree equal to one. This random graph model is referred to as preferential attachment.
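As with uniform attachment, the construction can be simulated directly. The following Python sketch is an illustration under our own naming (`preferential_attachment_degrees` is hypothetical); it tracks total degrees only and counts the out-edge of the new node before the target is chosen, as described above.

```python
import random

def preferential_attachment_degrees(n, rng):
    """Return the total degrees of nodes 1..n for one sample of the graph G_n."""
    deg = [0] * (n + 1)             # index 0 is unused
    deg[1] = 2                      # initial loop: one in-degree plus one out-degree
    for m in range(2, n + 1):
        deg[m] = 1                  # the out-edge of node m counts before it chooses
        # choose a target among nodes 1..m with probability proportional to
        # total degree; the weights sum to 2m - 1, so P(self-loop) = 1/(2m - 1)
        target = rng.choices(range(1, m + 1), weights=deg[1:m + 1])[0]
        deg[target] += 1
    return deg[1:]

degrees = preferential_attachment_degrees(500, random.Random(0))
```

Every step adds one edge and hence two units of total degree, so the degrees of a sample always sum to $2n$.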



This model has been well studied, and it was shown in Bollobás et al. (2001) that if $W$ is equal to the in-degree of a node chosen uniformly at random from $G_n$, then $W$ converges to the Yule-Simon distribution (defined below). We will give a rate of convergence in the total variation distance for this asymptotic, a result that cannot be read from the main results of Bollobás et al. (2001). Some rates of convergence in this and related random graph models can be found in the thesis Ford (2009), but the techniques and results there do not appear to overlap with ours below.

We say the random variable $Z$ has the Yule-Simon distribution if

$$\mathbb{P}[Z = k] = \frac{4}{k(k+1)(k+2)}, \qquad k = 1, 2, \ldots$$

The following is our main result. 

Theorem 6.1. Let $W_{n,i}$ be the total degree of vertex $i$ in the preferential
attachment graph on $n$ vertices and let $I$ be uniform on $\{1, \ldots, n\}$, independent
of $W_{n,i}$. If $Z$ has the Yule-Simon distribution, then

$$d_{TV}(\mathscr{L}(W_{n,I}), \mathscr{L}(Z)) \le \frac{C \log n}{n}$$

for some constant $C$ independent of $n$.

Remark 6.1. The notation $\mathscr{L}(W_{n,I})$ in the statement of Theorem 6.1
should be interpreted as

$$\mathscr{L}(W_{n,I} \mid I = i) = \mathscr{L}(W_{n,i}).$$

We will use similar notation without remark frequently in the sequel. 

At this point, the reader may wonder where the geometric distribution 
enters into the discussion above. The following elementary proposition clar- 
ifies this point. 

Proposition 6.2. If $U$ has the uniform distribution on $(0, 1)$, and given $U$, we define $Z$ such that $\mathscr{L}(Z) = \mathrm{Ge}(\sqrt{U})$, then $Z$ has the Yule-Simon distribution.
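One way to check the proposition is a direct Beta-integral computation. The derivation below is our own sketch rather than the authors' argument; it assumes $\mathrm{Ge}(p)$ has mass $p(1-p)^{k-1}$ at $k = 1, 2, \ldots$ and uses the substitution $v = \sqrt{u}$.

```latex
\begin{align*}
\mathbb{P}[Z = k]
  &= \int_0^1 \sqrt{u}\,\bigl(1 - \sqrt{u}\bigr)^{k-1}\,du
   = \int_0^1 2v^2 (1 - v)^{k-1}\,dv \\
  &= 2\,B(3, k)
   = \frac{2 \cdot 2!\,(k-1)!}{(k+2)!}
   = \frac{4}{k(k+1)(k+2)}, \qquad k = 1, 2, \ldots
\end{align*}
```

Here $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ is the Beta function, and the final expression matches the Yule-Simon probabilities.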

Our strategy to prove Theorem 6.1 will be to show that for $I$ uniform
on $\{1, \ldots, n\}$ and $U$ uniform on $(0, 1)$ we have

1. $d_{TV}(\mathscr{L}(W_{n,I}), \mathrm{Ge}(\mathbb{E}[W_{n,I} \mid I]^{-1})) \le C \log(n)/n$,

2. $d_{TV}(\mathrm{Ge}(\mathbb{E}[W_{n,I} \mid I]^{-1}), \mathrm{Ge}(\sqrt{I/n})) \le C \log(n)/n$,

3. $d_{TV}(\mathrm{Ge}(\sqrt{I/n}), \mathrm{Ge}(\sqrt{U})) \le C \log(n)/n$,

where here and in what follows we use the letter $C$ as a generic constant
which may differ from line to line. From this point, Theorem 6.1 follows
from the triangle inequality and Proposition 6.2.

Item 1 will follow from our framework above; in particular we will use
the following result which may be of independent interest. We postpone the
proof to the end of the section.

Theorem 6.3. Retaining the notation and definitions above, we have

$$d_{TV}(\mathscr{L}(W_{n,i}), \mathrm{Ge}(1/\mathbb{E} W_{n,i})) \le \frac{C}{i}$$

for some constant $C$ independent of $n$ and $i$.

To show Items 2 and 3 we will need the following lemma. The first
statement is found in Bollobás et al. (2001), p. 283, and the second follows
easily from the first.

Lemma 6.4 (Bollobás et al. (2001)). Retaining the notation and definitions
above, for all $1 \le i \le n$,

$$\Big|\mathbb{E} W_{n,i} - \sqrt{n/i}\Big| \le \frac{C\sqrt{n}}{i^{3/2}}$$

and

$$\Big|\frac{1}{\mathbb{E} W_{n,i}} - \sqrt{i/n}\Big| \le \frac{C}{\sqrt{in}}.$$



Our final general lemma is useful for handling total variation distance 
for conditionally defined random variables. 

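Lemma 6.5. Let $W$ and $V$ be random variables and let $X$ be a random
element defined on the same probability space. Then

$$d_{TV}(\mathscr{L}(W), \mathscr{L}(V)) \le \mathbb{E}\, d_{TV}(\mathscr{L}(W \mid X), \mathscr{L}(V \mid X)).$$

Proof. If $f : \mathbb{R} \to [0, 1]$, then

$$|\mathbb{E}[f(W) - f(V)]| \le \mathbb{E}\,|\mathbb{E}[f(W) - f(V) \mid X]| \le \mathbb{E}\, d_{TV}(\mathscr{L}(W \mid X), \mathscr{L}(V \mid X)).$$

□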



Armed with these lemmas, we can prove Theorem 6.1.

Proof of Theorem 6.1. We first claim that

$$d_{TV}(\mathrm{Ge}(p), \mathrm{Ge}(p - \varepsilon)) \le \frac{\varepsilon}{p} \le \frac{\varepsilon}{p - \varepsilon}. \tag{6.1}$$

The second inequality of (6.1) is obvious and the first follows by coupling
the Bernoulli sequences that generate the geometric variables via an infinite
i.i.d. sequence of random variables uniform on $(0, 1)$ in the usual way.
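In the coupling just described, the two geometric variables differ exactly when the first uniform falling below $p$ lies above $p - \varepsilon$, an event of probability $\varepsilon/p$. The short Python sketch below, with the hypothetical helper `tv_geometric`, evaluates the total variation distance numerically so that it can be compared with both bounds in (6.1); it assumes $\mathrm{Ge}(p)$ has mass $p(1-p)^{k-1}$ at $k = 1, 2, \ldots$.

```python
def tv_geometric(p, p_small, terms=10_000):
    """Total variation distance between Ge(p) and Ge(p_small) on k = 1, 2, ...,
    computed from the first `terms` atoms (the neglected tail is negligible
    for the parameters used below)."""
    total = 0.0
    for k in range(1, terms + 1):
        total += abs(p * (1 - p) ** (k - 1) - p_small * (1 - p_small) ** (k - 1))
    return 0.5 * total

p, eps = 0.3, 0.05                  # illustrative values, chosen arbitrarily
distance = tv_geometric(p, p - eps)
coupling_bound = eps / p            # first bound in (6.1)
weaker_bound = eps / (p - eps)      # second bound in (6.1)
```

The coupling bound is generally not tight, since the total variation distance is an infimum over all couplings, but it is the form that is convenient in the argument.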
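Using (6.1) and Lemma 6.4 we easily obtain

$$d_{TV}(\mathrm{Ge}(1/\mathbb{E} W_{n,i}), \mathrm{Ge}(\sqrt{i/n})) \le \frac{C}{i},$$

and applying Lemma 6.5 we find

$$d_{TV}(\mathrm{Ge}(\mathbb{E}[W_{n,I} \mid I]^{-1}), \mathrm{Ge}(\sqrt{I/n})) \le \frac{C \log(n)}{n},$$

which is Item 2 above.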

Now, coupling $U$ to $I$ by writing $U = I/n - V$, where $V$ is uniform
on $(0, 1/n)$ and independent of $I$, and using first (6.1) and then Lemma 6.5
leads to

$$d_{TV}(\mathrm{Ge}(\sqrt{I/n}), \mathrm{Ge}(\sqrt{U})) \le \frac{C \log(n)}{n},$$

which is Item 3 above.

Finally, applying Lemma 6.5 to Theorem 6.3 yields the claim related to
Item 1 above so that Theorem 6.1 is proved. □



The remainder of this section is devoted to the proof of Theorem 6.3;
recall $W_{n,i}$ is the total degree of vertex $i$ in the preferential attachment
graph on $n$ vertices. Since we want to apply our geometric approximation
framework using the equilibrium distribution, we will use Proposition 2.3
and so we first construct a variable having the size-bias distribution of $W_{n,i} - 1$.
To facilitate this construction we need some auxiliary variables.

For $j \ge i$, let $X_{j,i}$ be the indicator variable of the event that vertex $j$
has an outgoing edge connected to vertex $i$ in $G_j$ so that we can denote
$W_{j,i} = 1 + \sum_{k=i}^{j} X_{k,i}$. In this notation, for $1 \le i < j \le n$,

$$\mathbb{P}(X_{j,i} = 1 \mid G_{j-1}) = \frac{W_{j-1,i}}{2j - 1},$$

and for $1 \le i \le n$,

$$\mathbb{P}(X_{i,i} = 1 \mid G_{i-1}) = \frac{1}{2i - 1}.$$



The following well known result (see Proposition 2.2 and the discussion
after in Chen, Goldstein, and Shao (2011)) will allow us to use this decomposition
to size-bias $W_{n,i} - 1$.



Proposition 6.6 (Chen et al. (2011)). Let $X_1, \ldots, X_n$ be zero-one random
variables with $\mathbb{P}(X_i = 1) = p_i$. For each $i = 1, \ldots, n$, let $(X_j^{(i)})_{j \ne i}$ have the
distribution of $(X_j)_{j \ne i}$ conditional on $X_i = 1$. If $X = \sum_{i=1}^{n} X_i$, $\mu = \mathbb{E}[X]$,
and $K$ is chosen independent of the variables above with $\mathbb{P}(K = k) = p_k/\mu$,
then $X^s = \sum_{j \ne K} X_j^{(K)} + 1$ has the size-bias distribution of $X$.
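The proposition can be checked by brute-force enumeration on a small example. The Python sketch below is our own illustration: the function names and the particular dependent joint law on $\{0,1\}^3$ are hypothetical. It compares the law of $X^s$ built as in the proposition with the defining formula $\mathbb{P}(X^s = m) = m\,\mathbb{P}(X = m)/\mu$.

```python
from itertools import product

def size_bias_via_proposition(joint):
    """joint maps 0/1 tuples (x_1, ..., x_n) to probabilities.
    Returns the pmf of X^s built by picking K and conditioning on X_K = 1."""
    n = len(next(iter(joint)))
    p = [sum(pr for x, pr in joint.items() if x[i] == 1) for i in range(n)]
    mu = sum(p)
    law = {}
    for k in range(n):
        if p[k] == 0:
            continue
        for x, pr in joint.items():
            if x[k] == 1:
                # P(K = k) * P(configuration | X_k = 1); since x_k = 1, the sum
                # over j != k plus one equals sum(x)
                value = sum(x)
                law[value] = law.get(value, 0.0) + (p[k] / mu) * (pr / p[k])
    return law

def size_bias_direct(joint):
    """pmf of the size-bias distribution from its definition m P(X = m) / E[X]."""
    pmf = {}
    for x, pr in joint.items():
        pmf[sum(x)] = pmf.get(sum(x), 0.0) + pr
    mu = sum(m * pr for m, pr in pmf.items())
    return {m: m * pr / mu for m, pr in pmf.items() if m > 0}

# A small dependent example: unnormalized weights on {0,1}^3 (arbitrary choice).
weights = {x: 1.0 + 2.0 * x[0] * x[1] + x[2] for x in product((0, 1), repeat=3)}
total = sum(weights.values())
joint = {x: w / total for x, w in weights.items()}

law_prop = size_bias_via_proposition(joint)
law_direct = size_bias_direct(joint)
```

The two dictionaries agree up to rounding because each configuration with $m$ ones is counted once for every index $k$ with $x_k = 1$.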



Roughly, Proposition 6.6 implies that in order to size-bias $W_{n,i} - 1$, we
choose an indicator $X_{K,i}$ where for $k = i, \ldots, n$, $\mathbb{P}(K = k)$ is proportional
to $\mathbb{P}(X_{k,i} = 1)$ (and zero for other values), then attach vertex $K$ to vertex $i$
and sample the remaining edges conditional on this event. Note that given
$K = k$, in the graphs $G_l$, $1 \le l < i$ and $k < l \le n$, this conditioning does
not change the original rule for generating the preferential attachment graph
given $G_{l-1}$. The following lemma implies the remarkable fact that in order
to generate the graphs $G_l$ for $i \le l < k$ conditional on $X_{k,i} = 1$ and $G_{l-1}$, we
attach edges following the same rule as preferential attachment, but include
the edge from vertex $k$ to vertex $i$ in the degree count.

Lemma 6.7. Retaining the notation and definitions above, for $i \le j < k$
we have

$$\mathbb{P}(X_{j,i} = 1 \mid X_{k,i} = 1, G_{j-1}) = \frac{1 + W_{j-1,i}}{2j},$$

where we define $W_{i-1,i} = 1$.

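Proof. By Bayes' rule, we have

$$\mathbb{P}(X_{j,i} = 1 \mid X_{k,i} = 1, G_{j-1}) = \frac{\mathbb{P}(X_{j,i} = 1 \mid G_{j-1})\, \mathbb{P}(X_{k,i} = 1 \mid X_{j,i} = 1, G_{j-1})}{\mathbb{P}(X_{k,i} = 1 \mid G_{j-1})}, \tag{6.2}$$

and we will calculate the three probabilities appearing in (6.2). First, for
$i \le j$, we have

$$\mathbb{P}(X_{j,i} = 1 \mid G_{j-1}) = \frac{W_{j-1,i}}{2j - 1},$$

which implies

$$\mathbb{P}(X_{k,i} = 1 \mid G_{j-1}) = \frac{\mathbb{E}[W_{k-1,i} \mid G_{j-1}]}{2k - 1}$$

and

$$\mathbb{P}(X_{k,i} = 1 \mid X_{j,i} = 1, G_{j-1}) = \frac{\mathbb{E}[W_{k-1,i} \mid X_{j,i} = 1, G_{j-1}]}{2k - 1}.$$

Now, to compute the conditional expectations appearing above, note
first that

$$\mathbb{E}(W_{k,i} \mid G_{k-1}) = W_{k-1,i} + \frac{W_{k-1,i}}{2k - 1} = \Big(\frac{2k}{2k - 1}\Big) W_{k-1,i},$$

and thus

$$\mathbb{E}(W_{k,i} \mid G_{k-2}) = \Big(\frac{2k}{2k - 1}\Big)\Big(\frac{2k - 2}{2k - 3}\Big) W_{k-2,i}.$$

Iterating, we find that for $i, s < k$,

$$\mathbb{E}(W_{k,i} \mid G_{k-s}) = \prod_{t=1}^{s} \Big(\frac{2(k - t + 1)}{2(k - t + 1) - 1}\Big) W_{k-s,i}. \tag{6.3}$$

By setting $j - 1 = k - s$ and then replacing $k$ by $k - 1$ in (6.3) we obtain

$$\mathbb{E}(W_{k-1,i} \mid G_{j-1}) = \prod_{t=1}^{k-j} \Big(\frac{2(k - t)}{2(k - t) - 1}\Big) W_{j-1,i},$$

which also implies

$$\mathbb{E}(W_{k-1,i} \mid X_{j,i} = 1, G_{j-1}) = \prod_{t=1}^{k-j-1} \Big(\frac{2(k - t)}{2(k - t) - 1}\Big) (W_{j-1,i} + 1).$$

Substituting these expressions appropriately into (6.2) proves the lemma.

□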
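The one-step recursion $\mathbb{E}(W_{k,i} \mid G_{k-1}) = \frac{2k}{2k-1} W_{k-1,i}$ used in the proof can be checked numerically by propagating the exact law of the degree of vertex $i$. The Python sketch below is our own illustration with hypothetical function names; it relies only on the transition rule $\mathbb{P}(X_{j,i} = 1 \mid G_{j-1}) = W_{j-1,i}/(2j-1)$ and the self-loop probability $1/(2i-1)$ stated earlier in this section.

```python
def expected_degree_dp(n, i):
    """E[W_{n,i}] obtained by propagating the exact law of the degree of vertex i."""
    # After step i: degree 1 from the out-edge, plus 1 if vertex i attaches to itself.
    self_loop = 1 / (2 * i - 1)
    law = {1: 1 - self_loop, 2: self_loop}
    for j in range(i + 1, n + 1):
        nxt = {}
        for w, pr in law.items():
            hit = w / (2 * j - 1)            # P(X_{j,i} = 1 | W_{j-1,i} = w)
            nxt[w] = nxt.get(w, 0.0) + pr * (1 - hit)
            nxt[w + 1] = nxt.get(w + 1, 0.0) + pr * hit
        law = nxt
    return sum(w * pr for w, pr in law.items())

def expected_degree_product(n, i):
    """Closed form from iterating the recursion: the product of 2k/(2k-1), k = i..n."""
    value = 1.0
    for k in range(i, n + 1):
        value *= 2 * k / (2 * k - 1)
    return value

for n, i in ((10, 1), (30, 7), (200, 50)):
    print(n, i, expected_degree_dp(n, i), expected_degree_product(n, i), (n / i) ** 0.5)
```

The last printed column, $\sqrt{n/i}$, is included only for comparison with the approximation of $\mathbb{E} W_{n,i}$ used in the strategy for Theorem 6.1.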

The previous lemma suggests the following (embellished) construction
of $(W_{n,i} \mid X_{k,i} = 1)$ for any $1 \le i \le k \le n$. Here and below we will denote
quantities related to this construction with a superscript $k$. First we generate
$G^k_{i-1}$, a graph with $i - 1$ vertices, according to the usual preferential
attachment model. At this point, if $i \ne k$, vertex $i$ and $k$ are added to the
graph, along with a vertex labelled $i'$ with an edge to it emanating from
vertex $k$. Given $G^k_{i-1}$ and these additional vertices and edges, we generate
$G^k_i$ by connecting vertex $i$ to a vertex $j$ randomly chosen from the vertices
$1, \ldots, i, i'$ proportional to their degree, where $i$ has degree one (from the
out-edge) and $i'$ has degree one (from the in-edge emanating from vertex $k$). If
$i = k$, we attach $i$ to $i'$ and denote the resulting graph by $G^k_i$. For $i < j < k$,
we generate the graphs $G^k_j$ recursively from $G^k_{j-1}$ by connecting vertex $j$ to
a vertex $l$ randomly chosen from the vertices $1, \ldots, j, i'$ proportional to their
degree, where $j$ has degree one (from the out-edge). Note that none of the
vertices $1, \ldots, k - 1$ can connect to vertex $k$. We now define $G^k_k = G^k_{k-1}$,
and for $j = k + 1, \ldots, n$, we generate $G^k_j$ from $G^k_{j-1}$ according to preferential
attachment among the vertices $1, \ldots, j, i'$.

The following lemma summarizes relevant properties of this construction. 

Lemma 6.8. Let $1 \le i \le k \le n$ and retain the notation and definitions
above.

1. $\mathscr{L}(W^k_{n,i} + W^k_{n,i'}) = \mathscr{L}(W_{n,i} \mid X_{k,i} = 1)$.

2. For fixed $i$, if $K$ is a random variable such that

$$\mathbb{P}(K = k) = \frac{\mathbb{P}(X_{k,i} = 1)}{\mathbb{E} W_{n,i} - 1}, \qquad k = i, \ldots, n,$$

then $W^K_{n,i} + W^K_{n,i'} - 1$ has the size-bias distribution of $W_{n,i} - 1$.

3. Conditional on the event $\{W^k_{n,i} + W^k_{n,i'} = m + 1\}$, the variable $W^k_{n,i}$ is
uniformly distributed on the integers $1, 2, \ldots, m$.

4. $W^K_{n,i} - 1$ has the discrete equilibrium distribution of $W_{n,i} - 1$.

Proof. Items 1 and 2 follow from Proposition 6.6 and Lemma 6.7. Viewing
$(W^k_{n,i}, W^k_{n,i'})$ as the number of balls of two colors in a Pólya urn model started
with one ball of each color, Item 3 follows from induction on $m$, and Item 4
follows from Proposition 2.3. □
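The uniformity claim in Item 3 is the classical fact about a Pólya urn started with one ball of each color. The following Python sketch, with the hypothetical helper `polya_urn_law`, computes the exact law with rational arithmetic so the uniformity can be checked without simulation.

```python
from fractions import Fraction

def polya_urn_law(draws):
    """Exact law of the number of balls of the first color after `draws` draws,
    starting from one ball of each color and adding one ball of the drawn color."""
    law = {1: Fraction(1)}
    for step in range(draws):
        total = step + 2                     # balls in the urn before this draw
        nxt = {}
        for a, pr in law.items():
            nxt[a + 1] = nxt.get(a + 1, 0) + pr * Fraction(a, total)
            nxt[a] = nxt.get(a, 0) + pr * Fraction(total - a, total)
        law = nxt
    return law

law_after_six = polya_urn_law(6)             # urn then holds m + 1 = 8 balls, so m = 7
```

After $m - 1$ draws the urn holds $m + 1$ balls and the count of the first color is uniform on $1, \ldots, m$, which is the statement of Item 3 in urn language.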

With this setup, we are ready to give a proof of Theorem 6.3.

Proof of Theorem 6.3. We will apply Theorem O to $\mathscr{L}(W_{n,i} - 1)$, so that
we must find a coupling of a variable with this distribution to that of a
variable having its discrete equilibrium distribution.

For each fixed $k = i, \ldots, n$ we will construct $(X^k_{j,i}, \tilde{X}^k_{j,i})_{j \ge i}$ so that $(\tilde{X}^k_{j,i})_{j \ge i}$ and $(X^k_{j,i})_{j \ge i}$ are distributed as the indicators of the events vertex $j$ connects to vertex $i$ in $G_j$ and $G^k_j$, respectively. We will use the notation

$$W^k_{j,i} = 1 + \sum_{m=i}^{j} X^k_{m,i} \quad\text{and}\quad \tilde{W}^k_{j,i} = 1 + \sum_{m=i}^{j} \tilde{X}^k_{m,i},$$

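which will be distributed as the degree of vertex $i$ in the appropriate graphs.
The constructions for $k = i$ and $k > i$ differ, so assume here that $k > i$.
Let $U^k_{j,i}$ be independent uniform $(0, 1)$ random variables and first define

$$X^k_{i,i} = I[U^k_{i,i} < 1/(2i)] \quad\text{and}\quad \tilde{X}^k_{i,i} = I[U^k_{i,i} < 1/(2i - 1)].$$

Now, for $i < j < k$, and assuming that $(W^k_{j-1,i}, \tilde{W}^k_{j-1,i})$ is given, we define

$$X^k_{j,i} = I\Big[U^k_{j,i} < \frac{W^k_{j-1,i}}{2j}\Big] \quad\text{and}\quad \tilde{X}^k_{j,i} = I\Big[U^k_{j,i} < \frac{\tilde{W}^k_{j-1,i}}{2j - 1}\Big]. \tag{6.4}$$

For $j = k$ we set $X^k_{k,i} = 0$ and $\tilde{X}^k_{k,i}$ as in (6.4) with $j = k$, and for $j > k$ we
define

$$X^k_{j,i} = I\Big[U^k_{j,i} < \frac{W^k_{j-1,i}}{2j - 1}\Big] \quad\text{and}\quad \tilde{X}^k_{j,i} = I\Big[U^k_{j,i} < \frac{\tilde{W}^k_{j-1,i}}{2j - 1}\Big].$$

Thus we have recursively defined the variables $(X^k_{j,i}, \tilde{X}^k_{j,i})$ and it is clear
they are distributed as claimed with $(W^k_{j,i}, \tilde{W}^k_{j,i})$ distributed as the required
degree counts. Note also that $\tilde{W}^k_{j,i} \ge W^k_{j,i}$ and $\tilde{X}^k_{j,i} \ge X^k_{j,i}$. We also define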



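the events

$$A^k_{j,i} := \{\min\{i \le l \le n : X^k_{l,i} \ne \tilde{X}^k_{l,i}\} = j\}.$$

Using that $W^k_{j-1,i} = \tilde{W}^k_{j-1,i}$ under $A^k_{j,i}$ (which also implies $A^k_{j,i} = \emptyset$ for $j > k$)
we have

$$\mathbb{P}(W^k_{n,i} \ne \tilde{W}^k_{n,i}) = \sum_{j=i}^{n} \mathbb{P}(A^k_{j,i}) \le \sum_{j=i}^{k-1} \mathbb{P}\Big(\frac{\tilde{W}^k_{j-1,i}}{2j} \le U^k_{j,i} < \frac{\tilde{W}^k_{j-1,i}}{2j - 1}\Big) + \mathbb{P}\Big(U^k_{k,i} < \frac{\tilde{W}^k_{k-1,i}}{2k - 1}\Big)$$

$$= \sum_{j=i}^{k-1} \mathbb{E}\Big[\frac{\tilde{W}^k_{j-1,i}}{2j - 1} - \frac{\tilde{W}^k_{j-1,i}}{2j}\Big] + \mathbb{E}\Big[\frac{\tilde{W}^k_{k-1,i}}{2k - 1}\Big], \tag{6.5}$$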



where we write $W^k_{i-1,i} := \tilde{W}^k_{i-1,i} := 1$. Finally, starting from (6.5) and using
the computations in the proof of Lemma 6.7 and the estimates in Lemma
6.4, we find

$$\mathbb{P}(W^k_{n,i} \ne \tilde{W}^k_{n,i}) \le \frac{\mathbb{E} W_{k-1,i}}{2k - 1} + \sum_{j=i}^{k-1} \mathbb{E} W_{j-1,i} \Big(\frac{1}{2j - 1} - \frac{1}{2j}\Big) \le C/i.$$

If $k = i$, it is clear from the construction preceding Lemma 6.8 that an easy
coupling similar to that above will yield $\mathbb{P}(W^i_{n,i} \ne \tilde{W}^i_{n,i}) \le C/i$. Since these
bounds do not depend on $k$, we also have

$$\mathbb{P}(W^K_{n,i} - 1 \ne \tilde{W}^K_{n,i} - 1) \le C/i, \tag{6.6}$$

and the result now follows from Lemma 6.8, (6.6), and (2.9) of Remark [